


LETTERS TO NATURE 



isolated repeats show < 10% overlap with our current set), we
have reached the point of diminishing returns. The map covers the 
entire mouse genome, with the markers being sufficiently abun- 
dant, polymorphic and stable to allow the mapping of monogenic 
or polygenic traits in virtually any mouse cross of interest 3 ' . 
Moreover, the markers are sufficiently dense to facilitate posi- 
tional cloning of most mouse mutations. With > 90% of the mouse 
genome being within 750 kb of a marker, and current mouse yeast 
artificial chromosome (YAC) libraries 14,15 having a mean insert
size > 750 kb, the map affords ready access to the vast majority of 
the genome with little need for chromosomal walking, and 
provides a preliminary scaffold for constructing a genome-wide 
physical map 16.

The map also provides a common framework for the map- 
ping of mutations and cloned genes. In addition to our inte- 
gration with the Frederick cross, the SSLP map is being used as 
a framework for other mapping crosses, including public 
resources at the Jackson Laboratory 17 and the European 
Collaborative Interspecific Backcross (EUCIB) 18 . The EUCIB 






Project: the construction of dense genetic maps of mouse and
man. □



Received 23 October 1995; accepted 19 February 1996.

1. Copeland, N. G. et al. Science 262, 57-66 (1993).
2. Copeland, N. G. et al. Science 262, 67-82 (1993).
3. Dietrich, W. F. et al. Genetics 131, 423-447 (1992).
4. Dietrich, W. F. et al. in Genetic Maps 1992 (ed. O'Brien, S. J.) 4.110-4.142 (Cold Spring Harbor Laboratory Press, NY, 1992).
5. Miller, J. C. et al. in Genetic Variants and Strains of the Laboratory Mouse 3rd edn (eds Lyon, M. F. & Searle, A.) (Oxford Univ. Press, New York, 1994).
6. Dietrich, W. F. et al. Nature Genet. 7, 220-245 (1994).
7. Lander, E. S. et al. Genomics 1, 174-181 (1987).
8. Lincoln, S. E. & Lander, E. S. Genomics 14, 604-610 (1992).
9. Copeland, N. G. & Jenkins, N. A. Trends Genet. 7, 113 (1991).
10. Ceci, J. D. et al. Genomics 5, 699-709 (1989).
11. Buchberg, A. M. et al. Genetics 122, 153-161 (1989).
12. Jacob, H. J. et al. Nature Genet. 9, 63-69 (1995).
13. Dib, C. et al. Nature 380, 152-154 (1996).
14. Larin, Z., Monaco, A. P. & Lehrach, H. Proc. natn. Acad. Sci. U.S.A. 88, 4123 (1991).
15. Kusumi, K. et al. Mamm. Genome 4, 391-392 (1993).
16. Hudson, T. et al. Science 270, 1945-1955 (1995).
17. Rowe, L. B. et al. Mamm. Genome 5, 253-274 (1994).
18. The European Backcross Collaborative Group. Hum. molec. Genet. 3, 621-627 (1994).







A comprehensive genetic map of the human genome based on 5,264 microsatellites

Colette Dib*, Sabine Faure*, Cecile Fizames*,
Delphine Samson*, Nathalie Drouot*, Alain Vignal*,
Philippe Millasseau*, Sophie Marc*, Jamile Hazan*,
Eric Seboun*, Mark Lathrop†, Gabor Gyapay*,
Jean Morissette*‡ & Jean Weissenbach*§

* Genethon and CNRS URA 1922, 1 rue de l'Internationale, 91000 Evry,
France
† INSERM U358, Hopital Saint-Louis, Paris, France
‡ Centre de Recherche du Centre Hospitalier de l'Universite Laval,
Quebec G1V 4G2, Canada
§ To whom correspondence should be addressed

The great increase in successful linkage studies in a number of
higher eukaryotes during recent years has essentially resulted
from major improvements in reference genetic linkage maps 1-4,
which at present consist of short tandem repeat polymorphisms
of simple sequences or microsatellites 7,8. We report here the last
version of the Genethon human linkage map 5,6. This map consists
of 5,264 short tandem (AC/TG)n repeat polymorphisms with a
mean heterozygosity of 70%. The map spans a sex-averaged
genetic distance of 3,699 cM and comprises 2,335 positions, of
which 2,032 could be ordered with an odds ratio of at least
1,000 : 1 against alternative orders. The average interval size is
1.6 cM; 59% of the map is covered by intervals of 2 cM at most and
1% remains in intervals above 10 cM.

Microsatellite markers were obtained as described previously 5,6.
A heterozygosity above 0.5 was observed for 93% of the markers
and above 0.7 for 58%. These values remain very close to those of
our previous version 6. Average heterozygosity per chromosome
varied from 0.65 (chromosome X) to 0.73 (chromosome 19), with
a mean value of 0.70 for the entire collection of markers (Table 1).
Database sequence comparisons and searches detected matches 
of AFM (Association Francaise contre les Myopathies) markers 
with 19 genes and 74 anonymous markers. 
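
As an illustration of the heterozygosity values quoted above, the following minimal sketch computes expected heterozygosity as one minus the sum of squared allele frequencies. The text does not state which estimator was used, so this choice is an assumption, and the function name and allele sizes are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Return 1 - sum(p_i^2) for a list of allele labels, one per chromosome.

    Assumption: heterozygosity is estimated as gene diversity from allele
    frequencies; the source text does not specify its estimator.
    """
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical allele sizes (bp) seen on 8 chromosomes at a single marker.
example_alleles = [180, 182, 182, 184, 186, 186, 186, 188]
het = expected_heterozygosity(example_alleles)
```

A marker would pass the 0.5 or 0.7 thresholds mentioned in the text when the value returned for its unrelated-individual genotypes exceeds that threshold.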




Genotyping of the microsatellite markers was performed as 
described previously on the same eight CEPH (Centre d'Etudes 
du Polymorphisme Humaine) families (20 for the X chromosome), 
which comprised a total of 134 individuals and 186 meioses 5,6 (304
individuals and 291 meioses for the X chromosome). Genotypes
were submitted to the same error-checking procedures as
reported earlier 6. These procedures consisted of (1) a reinvestigation
of families with abnormally elevated recombination frequencies
between pairs of markers, and (2) correction or elimination of
all double recombinant genotypes of markers placed in short
linkage intervals. Such apparent double recombinations probably
result from mutation events that converted an allele of one
individual into the other allele. A more detailed analysis of
double-recombination events and mutations in microsatellites is 
in preparation. 
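
The second error check can be pictured with a small sketch: along one meiosis, a marker whose grandparental phase differs from both of its nearest informative neighbours, within a short interval, is an apparent double recombinant. The function name, the 'A'/'B' phase coding and the 5 cM span threshold are all assumptions made for illustration; the text only says "short linkage intervals".

```python
def flag_double_recombinants(positions_cm, phases, max_span_cm=5.0):
    """Return indices of markers that look like double recombinants.

    positions_cm: marker positions along the chromosome, in map order.
    phases: 'A' or 'B' (grandparental origin) per marker, None if uninformative.
    max_span_cm: hypothetical cut-off for a 'short' flanking interval.
    """
    informative = [
        (pos, phase, index)
        for index, (pos, phase) in enumerate(zip(positions_cm, phases))
        if phase is not None
    ]
    flagged = []
    for k in range(1, len(informative) - 1):
        left_pos, left_phase, _ = informative[k - 1]
        _, phase, index = informative[k]
        right_pos, right_phase, _ = informative[k + 1]
        # Phase switches away and back again between close flanking markers.
        if (
            left_phase == right_phase
            and phase != left_phase
            and (right_pos - left_pos) <= max_span_cm
        ):
            flagged.append(index)
    return flagged


# Hypothetical meiosis: marker 2 switches phase inside a 2 cM window.
positions = [0.0, 1.0, 2.0, 3.0, 20.0, 21.0]
phases = ["A", "A", "B", "A", None, "B"]
flagged = flag_double_recombinants(positions, phases)
```

Flagged genotypes would then be candidates for rescoring or removal, in the spirit of the procedure described above.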

Map construction was done in a stepwise manner with multiple 
controls at each step. The total length of this map as evaluated 
from the CILINK algorithm 9 is 3,699 cM (Table 1). This is almost 
identical in length to our previous version, despite the addition of 
new terminal markers that extend the 93/94 chromosome maps by 
145 cM (4%). The absence of increase in length probably results 
from a very thorough error-checking process and from elimination 
of apparent double-recombinant genotypes. The 5,264 markers 
are distributed in 2,335 positions (Fig. 1), 2,032 of which are
ordered with odds ratios against alternative orders of at least
1,000 : 1. The mean interval size is 1.6 cM. The fraction of the map
in intervals above 10 cM represents only 1 per cent of the total
linkage distance and consists of 3 intervals spanning 11 cM. Fifty-nine
per cent of the map is covered by intervals of 2 cM at most,
and 92 per cent by intervals of 5 cM at most. Markers from the 
CEPH and CHLC databases have been integrated into this map as 
shown in Fig. 2, which presents the map of chromosome 22 as an 
example. Detailed information, including integrated maps of all 
chromosomes, a list of markers, their primer sequences, heterozygosity,
number and size-range of alleles observed in the 8 (or 20)
genotyped CEPH families, sex-specific distances, and mutations,
will be presented in an extended reprint available on request and 
on an electronic server (http://www.genethon.fr). 
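
The interval statistics quoted in this paragraph can be reproduced from a list of map positions with a short calculation. The sketch below is illustrative only: the function name and the toy positions are hypothetical, and the final check assumes 23 chromosome maps (22 autosomes plus X) and the 3,699.2 cM total printed in Table 1.

```python
def interval_summary(positions_cm):
    """Summarise gaps between consecutive map positions on one chromosome."""
    ordered = sorted(positions_cm)
    intervals = [b - a for a, b in zip(ordered, ordered[1:])]
    total = sum(intervals)
    return {
        "mean_interval": total / len(intervals),
        # Fractions of total map length lying in intervals of a given size.
        "frac_le_2cm": sum(d for d in intervals if d <= 2.0) / total,
        "frac_le_5cm": sum(d for d in intervals if d <= 5.0) / total,
        "frac_gt_10cm": sum(d for d in intervals if d > 10.0) / total,
    }


# Hypothetical positions (cM) on a toy chromosome, not taken from the map.
toy = interval_summary([0.0, 1.0, 2.0, 4.0, 9.0, 20.0])

# Consistency check on the reported genome-wide figures, assuming
# 23 chromosome maps, so that 2,335 positions define 2,335 - 23 intervals.
reported_mean = 3699.2 / (2335 - 23)
```

Under these assumptions the reported total length and number of positions agree with the stated mean interval of 1.6 cM.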

The total sex-specific lengths of autosomes estimated by 
CILINK 9 show only slight variations when compared to the 
lengths of the previous map 6 . The length excess observed for the 
female map is comparable to other published maps. This excess 

NATURE • VOL 380 ■ 14 MARCH 1996 






TABLE 1 Quantitative characteristics of maps and markers by chromosome

Chromosome  Physical  Markers  Markers  Genetic length (cM)†             Markers  Mean             Ordered
            length*   mapped   per Mb   Female    Male      Sex-average  per cM   heterozygosity‡  1,000 : 1§
            (Mb)

1           263       461      1.75     358.2     220.3     292.7        1.57     0.71             142
2           255       452      1.77     324.8     210.6     277.0        1.63     0.71             160
3           214       353      1.65     269.3     182.6     233.0        1.51     0.71             199
4           203       280      1.38     270.7     157.2     212.2        1.32     0.69             103
5           194       312      1.61     242.1     147.2     197.6        1.57     0.70             117
6           183       311      1.70     265.0     135.2     201.1        1.54     0.71             116
7           171       272      1.59     187.2     178.1     184.0        1.48     0.72             116
8           155       249      1.61     221.0     113.1     166.4        1.50     0.70             93
9           145       189      1.30     194.5     138.5     166.5        1.13     0.71             67
10          144       281      1.95     209.7     146.1     181.7        1.55     0.70             101
11          144       273      1.90     180.0     121.9     156.1        1.75     0.70             104
12          143       249      1.74     211.8     126.2     169.1        1.47     0.71             95
13q         98        164      1.67     132.3     97.2      117.5        1.40     0.70             61
14q         93        162      1.74     154.4     103.6     128.6        1.26     0.72             70
15q         89        145      1.63     131.4               110.2        1.36     0.70             52
16          98        180      1.84     169.1     98.5      130.8        1.38     0.71             75
17          92        186      2.02     145.4     104.0     128.7        1.44     0.70             68
18          85        136      1.60     151.3     92.7      123.8        1.10     0.70             50
19          67        121      1.81     115.0     98.0      109.9        1.10     0.73             51
20          72        144      2.00     120.3     73.3      96.5         1.49     0.70             52
21q         39        61       1.56     70.6      46.8      59.6         1.02     0.70             27
22q         43        67       1.56     74.7      46.9      58.1         1.15     0.71             32
X           164       216      1.32     198.1               198.1        1.09     0.65             81

Genome      3,154     5,264    1.67     4,396.9   2,729.7   3,699.2      1.38     0.70             2,032

* Physical lengths are from ref. 18. † Genetic lengths were determined using the CILINK program of the LINKAGE package 9. ‡ Heterozygosities were determined using genotypes
of 56 autosomes (or 42 X chromosomes) from unrelated caucasoid individuals (CEPH grandparents of families 884, 1331, 1332, 1347, 1362 and 1416 and parents of families
102 and 1413). § 1,000 : 1 odds order was determined as for Fig. 1.
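
The derived columns of Table 1 follow from simple ratios of the genome-wide totals. The sketch below recomputes a few of them; the constant names are hypothetical, and subtracting the X-chromosome length before comparing female and male maps is a choice made here (the male total covers autosomes only), not a calculation reported in the text.

```python
# Genome-wide totals as printed in Table 1.
PHYSICAL_MB = 3154.0
MARKERS = 5264
SEX_AVERAGE_CM = 3699.2
FEMALE_CM = 4396.9    # per-chromosome female lengths sum to this with X included
MALE_CM = 2729.7      # autosomes only
X_FEMALE_CM = 198.1

# Marker density per megabase, as in the "Markers per Mb" column.
markers_per_mb = MARKERS / PHYSICAL_MB

# Average recombination rate implied by the two genome-wide lengths.
cm_per_mb = SEX_AVERAGE_CM / PHYSICAL_MB

# Female-to-male length ratio restricted to the autosomes.
female_to_male_autosomal = (FEMALE_CM - X_FEMALE_CM) / MALE_CM
```

The first ratio matches the 1.67 markers per Mb given in the genome row; the other two are illustrative summaries of the female length excess discussed in the text.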



FIG. 1 Genetic linkage map of the 2,335 positions defined by 5,264 markers. The vertical bar
delimits the length covered by the map as computed by the GMS algorithm 19 and represents
sex-averaged genetic distances in cM. Horizontal marks represent map positions of AFM
microsatellite markers. All distance values between positions were rounded to integer values
and distances between 0 and 1 cM were rounded to 1 cM. Slight differences may appear between
distances evaluated using GMS (this figure) and CILINK displayed in Table 1. This reflects the
fact that clusters of non-recombinant markers were processed in a different manner by each
program. Maps were constructed using the automated map construction algorithm MultiMap 20
based on CRIMAP 21. The order proposed by MultiMap (primary map) was further submitted to
another algorithm, GMS, based on the LINKAGE package 9. For markers not ordered with the
support of 1,000 : 1 odds on the primary map, the most likely order was chosen. Markers for
which there were two equally probable and 'most likely' positions usually did not recombine
with another marker situated between those two positions. Absence of recombination between
such markers was verified and they were submitted as non-recombinants for the GMS computation.
The order of loci proposed by the GMS algorithm (secondary map) was further checked for double
recombinants. Given the high density of markers, these double recombinant genotypes were very
unlikely and therefore subjected to rescoring. Among a total of 874 double recombinant
genotypes rescored, 71% were reassessed. A second genotyping experiment was carried out for
suspicious genotypes. This procedure was used until the latest secondary order no longer
revealed new double recombinant genotypes. The GMS algorithm usually computes sets of 15 to 20
markers from segments of a chromosome. For the final GMS computation, these segments were
defined manually to maximize the number of positions that could be ordered with 1,000 : 1 odds.
A total of 128 double recombinants could not be corrected. These apparent double recombinations
occurred in very small intervals surrounded by several markers of the alternative phase. Moving
them by a few cM did not solve the problem or resulted in an increased number of alternative
double recombinant genotypes on other haplotypes. Because such apparent double recombination
events are most probably mutations, they were eliminated from the dataset for the final map
calculations.
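
The 1,000 : 1 odds criterion used throughout corresponds to a difference of at least 3 in log10 likelihood between the best marker order and its closest competitor. The sketch below shows that comparison only; the function name and the likelihood values are hypothetical, and it does not reproduce MultiMap, CRIMAP or GMS.

```python
import math


def order_support(log10_likelihoods, odds_threshold=1000.0):
    """Compare candidate marker orders by their log10 likelihoods.

    Returns the best order, its log10 likelihood advantage over the
    runner-up, and whether that advantage meets the odds threshold.
    """
    ranked = sorted(log10_likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (best_order, best_ll), (_, second_ll) = ranked[0], ranked[1]
    lod_gap = best_ll - second_ll
    return best_order, lod_gap, lod_gap >= math.log10(odds_threshold)


# Hypothetical log10 likelihoods for three orders of loci A, B and C.
candidates = {"A-B-C": -120.4, "A-C-B": -124.1, "B-A-C": -126.0}
best, gap, supported = order_support(candidates)
```

When the threshold is not met, the caption above indicates that the most likely order was still chosen, but the position is not counted among those ordered at 1,000 : 1.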




also appears on the X-chromosome map where the interval 
density is lower. More generally, visual examination of the map 
suggests that the interval distribution is not uniform (Fig. 1). In
particular the interval sizes appear larger in a number of telomeric 
and subtelomeric regions. This might reflect an enhanced recom- 
bination frequency in such regions. To study the distribution of 
genetic markers throughout the human genome, we designed a 
simulation model that could account for variations resulting from 
marker identification and typing processes. On average, the 
observed numbers are close to the simulation using a density of 
two highly informative (AC)n microsatellites per cM without
significant bias in the observed marker distribution (results not 
shown). 
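
The simulation model itself is not described, so the following is only a rough sketch of the general idea: place markers at random along a map at a chosen density and summarise the resulting intervals. Uniform placement, the function name, the seed and the 200 cM example length are all assumptions; the effects of marker identification and typing that the authors' model accounts for are not represented.

```python
import random


def simulate_interval_fractions(map_length_cm, n_markers, seed=0):
    """Place markers uniformly at random and summarise the gaps between them."""
    rng = random.Random(seed)
    positions = sorted(rng.uniform(0.0, map_length_cm) for _ in range(n_markers))
    intervals = [b - a for a, b in zip(positions, positions[1:])]
    covered = sum(intervals)
    return {
        "mean_interval": covered / len(intervals),
        # Fractions of the covered length lying in small or large intervals.
        "frac_le_2cm": sum(d for d in intervals if d <= 2.0) / covered,
        "frac_gt_10cm": sum(d for d in intervals if d > 10.0) / covered,
    }


# Two markers per cM on a hypothetical 200 cM chromosome.
sim = simulate_interval_fractions(map_length_cm=200.0, n_markers=400)
```

Comparing such simulated fractions with the observed ones is one way to ask whether the marker distribution departs from a random one.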

The physical sizes of the largest gaps were estimated by 
measuring the number of radiation-induced breaks in a panel of 
whole-genome radiation hybrids developed recently 10 . Assuming 
that breakage distances of radiation hybrid maps are representa- 
tive of physical distances, 3 out of 4 large genetic intervals tested 
are notably smaller than expected from the linkage distance 
(results not shown). This suggests that a number of the gaps 
remaining on the genetic map correspond to regions with 




enhanced genetic recombination rather than to segments devoid 
of (AC)n microsatellites. However, as seen for one of the gaps
analysed, some intervals may represent actual physical gaps. 
Similarly, clusters of non-recombining markers can be separated 
by distances estimated to be of the order of several megabases. 
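
The estimator used to convert radiation-hybrid breakage into distance is not given in the text. As a hedged illustration, the sketch below applies a commonly used two-point approach: the breakage frequency between two markers is estimated from discordant retention, corrected for the retention frequencies, and converted to centiRays with D = -ln(1 - theta). The function name and hybrid counts are hypothetical.

```python
import math


def rh_two_point(both, a_only, b_only, neither):
    """Estimate breakage frequency and centiRay distance between two markers.

    Arguments are counts of hybrids retaining both markers, only A, only B,
    or neither. Assumption: a standard two-point model in which discordant
    retention is corrected by the marker retention frequencies.
    """
    total = both + a_only + b_only + neither
    retention_a = (both + a_only) / total
    retention_b = (both + b_only) / total
    expected_if_broken = retention_a + retention_b - 2.0 * retention_a * retention_b
    theta = (a_only + b_only) / (total * expected_if_broken)
    distance_cr = -math.log(1.0 - theta) * 100.0
    return theta, distance_cr


# Hypothetical panel of 100 hybrids typed for two markers flanking a gap.
theta, distance_cr = rh_two_point(both=20, a_only=5, b_only=5, neither=70)
```

A large genetic interval that yields a small breakage distance in such a calculation would be consistent with the enhanced recombination suggested above.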



FIG. 2 Integrated genetic linkage map of chromosome 22. All chromosome 
maps are presented in the extended reprint, as illustrated in this example 
for chromosome 22. The leftmost column gives names of genes or non-
AFM anonymous loci from the CEPH/CHLC databases. These loci were
positioned by the MultiMap algorithm at 1,000 : 1 odds with respect to a
framework of AFM markers. Gene names framed by brackets ([IL2RB])
indicate genes for which a microsatellite was specifically developed as a
genetic marker. These gene markers were genotyped, error-checked and 
positioned using the same criteria as for AFM markers. The vertical lines on 
the left indicate the intervals in which genes or non-AFM anonymous loci 
from the CEPH/CHLC databases could be placed by MultiMap at odds 
> 1,000 : 1. The thick vertical bar delimits the length covered by the map as 
computed by the GMS algorithm. Some loci were placed at 1,000 : 1 odds
outside the AFM framework; they are depicted on the top and/or bottom of 
the maps. Because genotypes of these loci were not submitted to the same 
quality control procedures as AFM markers, only relative positions, but not 
distances are presented, that is, their distance from other markers and the 
length of the interval corresponding to their position is arbitrary. AFM 
markers are displayed in the right part of the map from top to bottom 
according to the GMS defined order and distances. Markers appearing on 
the same line are markers with no recombinations in the informative subset 
of the 8 CEPH families tested. The order of AFM markers on a line is 
arbitrary. Numbers to the right of the thick bar between consecutive AFM
markers correspond to sex-averaged genetic distances in cM evaluated by
GMS between consecutive AFM markers. All values comprised between 0
and 1 cM were rounded to 1 cM. The vertical segments in front of the name
of AFM markers indicate groups of markers for which the order could not be 
resolved with odds > 1,000 : 1. 

METHODS. Integration of other loci: sequence comparisons and searches in 
databases detected matches of AFM markers with 19 genes and 74 
anonymous markers. These genes or loci, which had been independently 
assigned to the same chromosomes (except in one instance), could thus be 
precisely positioned on the present maps. We have attempted to include 
markers from the CEPH and CHLC databases in our maps. A framework map 
based on 1,844 AFM markers ordered at 1,000 : 1 odds was established
using the MultiMap algorithm. An additional 186 non-AFM markers could 
be positioned at 1,000 : 1 odds on this predefined MultiMap framework and 
are indicated on the maps. 



The highly informative microsatellites enabled us to order more 
than 2,000 markers with an odds ratio of 1,000 : 1 with the
genotyping of only eight reference CEPH families representing 
a total of 186 informative meioses at best. But we gave priority to 
marker density rather than to degree of resolution. With this 
dense map and the additional genetic mapping resources available 2,11-15,
linkage mapping of monogenic diseases can be readily
carried out to the centimorgan level in most instances. In addition, 
the availability of numerous highly informative markers will also 
serve to define regions of linkage disequilibrium in which a 
founder haplotype can be reconstituted in genetically isolated 
populations. The genome-wide searches for loci involved in 
complex diseases will also considerably benefit from this new 
resource. In addition, this map provides the scaffold for the 
construction of physical maps based on overlapping sets of yeast 
artificial chromosomes that have been recently published 16,17. □



Received 6 October 1995; accepted 8 January 1996.

1. Dietrich, W. F. et al. Nature Genet. 7, 220-245 (1994).
2. Murray, J. C. et al. Science 265, 2049-2070 (1994).
3. Barendse, W. et al. Nature Genet. 6, 227-235 (1994).
4. Jacob, H. J. et al. Nature Genet. 9, 63-69 (1995).
5. Weissenbach, J. et al. Nature 359, 794-801 (1992).
6. Gyapay, G. et al. Nature Genet. 7, 246-339 (1994).
7. Weber, J. L. & May, P. E. Am. J. hum. Genet. 44, 388-396 (1989).
8. Litt, M. & Luty, J. A. Am. J. hum. Genet. 44, 397-401 (1989).
9. Lathrop, G. M. & Lalouel, J. M. Am. J. hum. Genet. 36, 460-465 (1984).
10. Walter, M. et al. Nature Genet. 7, 22-28 (1994).
11. Spurr, N. K. et al. Eur. J. hum. Genet. 2, 193-252 (1994).
12. Gerken, S. C. et al. Am. J. hum. Genet. 56, 484-499 (1995).
13. Elsner, T. I., Albertsen, H., Gerken, S. C., Cartwright, P. & White, R. Am. J. hum. Genet. 56, 500-507 (1995).
14. Plaetke, R. & Schachtel, G. Am. J. hum. Genet. 56, 508-518 (1995).
15. Utah Marker Development Group Am. J. hum. Genet. 57, 619-628 (1995).
16. Chumakov, I. M. et al. Nature 377, 175-297 (1995).
17. Hudson, T. J. et al. Science 270, 1945-1954 (1995).
18. Morton, N. E. Proc. natn. Acad. Sci. U.S.A. 88, 7474-7476 (1991).
19. Lathrop, G. M. et al. Genomics 2, 157-164 (1988).
20. Matise, T. C., Perlin, M. & Chakravarti, A. Nature Genet. 6, 384-390 (1994).
21. Lander, E. S. & Green, P. Proc. natn. Acad. Sci. U.S.A. 84, 2363-2369 (1987).



ACKNOWLEDGEMENTS. This work was initiated at CEPH and results from discussions with D. Cohen.
It was supported by the Association Francaise contre les Myopathies, the Groupement de
Recherches et d'Etudes sur les Genomes and European Union (Biomed1). We acknowledge the
technical and clerical contribution of L. Baron, N. Becuwe, M. Besnard-Gonnet, I. Bordelais,
C. CaJousuan, C. Cruaud, M. Dubois, C. Dumont, C. Dupraz-Ramet, E. Ernst, K. Fortsat, M. Francos,
J. L. Mangua, C. Marquette, M. Mauge, E. M. Bene, M. Meugmer, O. Musetet, S. Nguyen, S. Pezard,
M. Tranchant, N. Sunn, N. Vega, E. Wunderle and V. Wunderle. The group of summer students
substantially contributed to the data collection and error checking. We thank the informatics team of
Genethon, particularly X. Benigni, L. Bougueleret, C. Discala, J.-M. Froussard, R. Gavrel, P. Gesnouin,
S. Poo**, P. Rodriguez-Tome, L. Sainte-Marthe, C. Scarpelli and G. Vaysseix. We also thank A. de Sano
and G. Bernardi for the DNA isochore fractionation, S. Cure for help in writing the manuscript, and
G. Petrano for continuous devotion to this project.




